Training a Tetris agent via interactive shaping: a demonstration of the TAMER framework

Authors

W. Bradley Knox

Peter Stone

Abstract

As computational learning agents continue to improve their ability to learn sequential decision-making tasks, a central but largely unfulfilled goal is to deploy these agents in real-world domains in which they interact with humans and make decisions that affect our lives. People will want such interactive agents to be able to perform tasks for which the agent’s original developers could not prepare it. Thus it will be imperative to develop agents that can learn from natural methods of communication. The teaching technique of shaping is one such method. In this context, we define shaping as training an agent through signals of positive and negative reinforcement. In a shaping scenario, a human trainer observes an agent and reinforces its behavior through push-buttons, spoken word (“yes” or “no”), facial expression, or any other signal that can be converted to a scalar signal of approval or disapproval. We treat shaping as a specific mode of knowledge transfer, distinct from (and probably complementary to) other natural methods of communication, including programming by demonstration and advice-giving. The key challenge before us is to create agents that can be shaped effectively. Our problem definition is as follows:

Related resources

Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces

While recent advances in deep reinforcement learning have allowed autonomous learning agents to succeed at a variety of complex tasks, existing algorithms generally require a lot of training data. One way to increase the speed at which agents are able to learn to perform tasks is by leveraging the input of human trainers. Although such input can take many forms, real-time, scalar-valued feedbac...

Reinforcement Learning from Demonstration and Human Reward

In this paper, we proposed a model-based method—IRL-TAMER— for combining learning from demonstration via inverse reinforcement learning (IRL) and learning from human reward via the TAMER framework. We tested our method in the Grid World domain and compared with the TAMER framework using different discount factors on human reward. Our results suggest that with one demonstration, although an agen...

Combining manual feedback with subsequent MDP reward signals for reinforcement learning

As learning agents move from research labs to the real world, it is increasingly important that human users, including those without programming skills, be able to teach agents desired behaviors. Recently, the TAMER framework was introduced for designing agents that can be interactively shaped by human trainers who give only positive and negative feedback signals. Past work on TAMER showed that...

Shaping Mario with Human Advice (Demonstration)

In this demonstration, we allow humans to interactively advise a Mario agent during learning, and observe the resulting changes in performance, as compared to its unadvised counterpart. We do this via a novel potential-based reward shaping framework, capable for the first time of handling the scenario of online feedback.

Interactive Demonstration of Pointing Gestures for Virtual Trainers

While interactive virtual humans are becoming widely used in education, training and delivery of instructions, building the animations required for such interactive characters in a given scenario remains a complex and time consuming work. One of the key problems is that most of the systems controlling virtual humans are mainly based on pre-defined animations which have to be re-built by skilled...

Publication date: 2010
